Comparing Binarisation Techniques for the Processing of Ancient Manuscripts

Authors

  • Rapeeporn Chamchong
  • Lance Chun Che Fung
  • Kevin Kok Wai Wong
Abstract

Many organizations preserve ancient manuscripts, both to protect the documents themselves and to retrieve the traditional knowledge they contain. With advances in computer technology, digitized media are now commonly used to record these documents. One objective of such work is to develop an efficient image processing system that can retrieve knowledge and information automatically from these manuscripts. Binarization is a preprocessing technique used to extract text and characters from manuscript images; its output feeds later stages such as character recognition and knowledge extraction. This paper compares different binarization techniques that could be used for processing ancient manuscripts. The aim is to improve these techniques, with the main objective of developing an automated preprocessing technique for ancient manuscript recognition and knowledge extraction.
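To make the idea of binarization concrete, the following is a minimal sketch of one widely used global method, Otsu's thresholding, written in plain Python. It is an illustration only: the abstract does not say which techniques the paper compares, and the function names `otsu_threshold` and `binarize`, as well as the convention that dark pixels are ink, are assumptions made for this example.

```python
def otsu_threshold(pixels):
    """Return the grey level (0-255) that maximises between-class variance.

    `pixels` is a flat list of integer grey levels. Illustrative sketch,
    not the method of the paper.
    """
    # Build the grey-level histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0    # number of pixels with grey level <= t
    sum_dark = 0  # sum of grey levels of those pixels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue  # no pixels in the dark class yet
        w_light = total - w_dark
        if w_light == 0:
            break     # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance, up to a constant factor of 1 / total**2.
        var_between = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image, threshold):
    """Map a 2-D list of grey levels to 1 (assumed ink) or 0 (background)."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]
```

A single global threshold like this tends to struggle with stains and uneven illumination, which are common in aged manuscripts; that limitation is one reason locally adaptive methods are often considered alongside it.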

Similar resources

Apport du traitement des images à la numérisation des documents manuscrits anciens

Image processing is often necessary for extracting the content of ancient documents. We present here techniques for restoring images and removing noise, extracting document structures (separating graphical elements and illustrations from text, extracting text lines) and, when possible, recognizing the textual or musical symbols which may be present in the image. These techniques, which are clas...

Document binarisation using Kohonen SOM

An integrated system for the binarisation of normal and degraded printed documents for the purpose of visualisation and recognition of text characters is proposed. In degraded documents, where considerable background noise or variation in contrast and illumination exists, there are many pixels that cannot be easily classified as foreground or background pixels. For this reason, it is necessary ...

Psychopathology and Mental Health in Middle Persian Texts

Objectives: The present study was designed to trace the literature related to the history of psychopathology and mental health in Middle Persian manuscripts. Method: The method consisted of library research into the handwritten manuscripts and the collection of Middle Persian (Pahlavi) texts dating back to some fifteen hundred years ago. Findings: The frequency of the term ravan (psyc...

The Philological Workstation BAMBI (Better Access to Manuscripts and Browsing of Images)

The aim of this project is the design, prototyping and production of advanced tools for interfacing to databases of reproductions of ancient manuscripts. The base material to be dealt with in the project is widely available and will soon have to undergo some treatment for conversion into more durable forms. This gives an excellent opportunity for the development of more subtle techniques of ha...

Digital Enhancement of Palm Leaf Manuscript Images using Normalization Techniques

Palm leaves were one of the earliest forms of writing media, and their use as writing material in South and Southeast Asia has been recorded from as early as the fifth century B.C. until as recently as the late 19th century. Palm leaf manuscripts relating to art and architecture, mathematics, astronomy, astrology, and medicine dating back several hundred years are still available for referen...

Publication date: 2010